PCT

WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification: C12Q 1/68, C07H 19/00

A1

(11) International Publication Number: WO 89/12694

(43) International Publication Date: 28 December 1989 (28.12.89)

(21) International Application Number: PCT/US89/02602

(22) International Filing Date: 20 June 1989 (20.06.89)



(30) Priority data:
209,247    20 June 1988 (20.06.88)    US

(60) Parent Application or Grant
(63) Related by Continuation
US    209,247 (CIP)
Filed on    20 June 1988 (20.06.88)



(71) Applicant (for all designated States except US): GENOMYX, INC. [US/US]; 453 Remillard Drive, Hillsboro, CA 94010 (US).

(72) Inventor; and
(75) Inventor/Applicant (for US only): BRENNAN, Thomas, M. [US/US]; 453 Remillard Drive, Hillsboro, CA 94010 (US).

(74) Agents: McDONNELL, John, J. et al.; Allegretti & Witcoff, Ltd., 10 South Wacker Drive, Chicago, IL 60606 (US).

(81) Designated States: AT (European patent), AU, BE (European patent), CH (European patent), DE (European patent), FR (European patent), GB (European patent), IT (European patent), JP, LU (European patent), NL (European patent), SE (European patent), US.



Published
With international search report.
Before the expiration of the time limit for amending the claims and to be republished in the event of the receipt of amendments.

(54) Title: DETERMINING DNA SEQUENCES BY MASS SPECTROMETRY

[Figure: ion current versus time traces for m/e 64, 65, 66 and 68]



(57) Abstract 

This invention relates to the methods, apparatus, reagents and mixtures of reagents for sequencing natural or recombinant
DNA and other polynucleotides. In particular, this invention relates to a method for sequencing polynucleotides based on mass
spectrometry to determine which of the four bases (adenine, guanine, cytosine or thymine) is a component of the terminal
nucleotide. In particular, the present invention relates to identifying the individual nucleotides by the mass of stable nuclide markers
contained within either the dideoxynucleotides, the DNA primer, or the deoxynucleotide added to the primer. This invention is
particularly useful in identifying specific DNA sequences in very small quantities in biological products produced by
fermentation or other genetic engineering techniques. The invention is therefore useful in evaluating safety and other health concerns
related to the presence of DNA in products resulting from genetic engineering techniques.



FOR THE PURPOSES OF INFORMATION ONLY

Codes used to identify States party to the PCT on the front pages of pamphlets publishing international
applications under the PCT.

AT  Austria
AU  Australia
BB  Barbados
BE  Belgium
BF  Burkina Fasso
BG  Bulgaria
BJ  Benin
BR  Brazil
CF  Central African Republic
CG  Congo
CH  Switzerland
CM  Cameroon
DE  Germany, Federal Republic of
DK  Denmark
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
HU  Hungary
IT  Italy
JP  Japan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
LI  Liechtenstein
LK  Sri Lanka
LU  Luxembourg
MC  Monaco
MG  Madagascar
ML  Mali
MR  Mauritania
MW  Malawi
NL  Netherlands
NO  Norway
RO  Romania
SD  Sudan
SE  Sweden
SN  Senegal
SU  Soviet Union
TD  Chad
TG  Togo
US  United States of America










DETERMINING DNA SEQUENCES BY MASS SPECTROMETRY

BACKGROUND OF THE INVENTION
A. Field of the Invention

This invention relates to the field of the determination of DNA
sequences and the uses of automated techniques for such determination. 

The ability to sequence DNA has become a core technology in 
molecular biology, and has contributed greatly to the understanding of DNA 
structural organization and gene function. The facility with which DNA 
sequencing may be accomplished will substantially affect the rate of 
development of related technologies, including the production of new
therapeutic agents, useful plant varieties and microorganisms via
recombinant DNA technology and the understanding of human genetic
disorders and pathology through gene mapping and chromosomal sequence
analysis. 

Initially, researchers focused on reading the genetic code and the 
translation of the nucleotide sequence into the amino acid sequence of a 
protein. This occurs by a process of DNA transcription into mRNA, and 
then actual synthesis of the protein on ribosomes. In eucaryotic cells, 
large specific segments of the initial transcript of mRNA, termed introns,
are transcribed but are excised during an intermediary processing step.
Much of the chromosomal DNA is not translated, and its specific function is
largely unknown. This "intervening" or intron DNA was first thought to be
excess genetic material. However, as biologists begin to unravel the details 
of cell differentiation and the processes controlling gene transcription it is 
now believed that the specific sequences of certain portions of some of 
these large regions of transcribed but untranslated DNA may also provide 
important regulatory signals. 

The potential applications which derive from DNA sequencing have 
only begun to be explored. On a large scale, analysis of human chromosomal
DNA is considered vital to understanding human pathological conditions,
including genetic diseases, AIDS and cancer, because often only subtle
differences, even single nucleotide substitutions, can lead to serious
disorders. Serious consideration is now being given to the sequencing of
the entire human genome - approximately 3 billion base pairs. The success 
of this project will depend on rapid, sensitive, inexpensive automated 
methods to sequence DNA. 






The fundamental approach to determination of DNA sequence has been 
well established. Restriction endonucleases are employed to cleave 
chromosomal DNA into specific smaller segments, and recombinant cloning 
techniques are then used to purify and generate analyzable quantities of 
DNA. The specific sequence of each segment can then be determined by
either the Maxam-Gilbert chemical cleavage, or preferably, the Sanger
dideoxy terminated enzymatic method. In either case, a set of all possible
fragments ending in a specific base is generated. The individual fragments
can be resolved electrophoretically by molecular weight, and the sequence
of the original DNA segment is then derived by knowing the identity of the
terminal base in each fragment.

In its broadest aspect, this invention is directed to methods and 
reagents for sequencing DNA and other polynucleotides. In particular, this 
invention describes reagents and methods for automating and increasing the
sensitivity of both the Sanger, Proc. Natl. Acad. Sci. USA, 74, 5463 (1977),
and Gish and Eckstein, Science, 240, 1520-1522 (1988), procedures for
sequencing polynucleotides. The methods of the present invention are based
on mass spectrometric determination of each of the four component terminal
nucleotide residues, where the information regarding the identity of the
individual nucleotides is contained in the mass of stable nuclide markers.

B. Summary of the Prior Art

In the Sanger dideoxy method (Proc. Natl. Acad. Sci. USA, 74, 5463
(1977)), the DNA to be sequenced is exposed to a DNA polymerase, a cDNA
primer, and a mixture of the four component deoxynucleotides, plus one of
the four possible 2',3'-dideoxy nucleotides. The DNA to be sequenced is
typically a single stranded DNA clone prepared in the phage vector M13,
although Chen and Seeburg have disclosed a method for applying the Sanger
method to supercoiled plasmid DNA (DNA, 4, 165-170 (1985)). In addition,
Innis et al., Proc. Natl. Acad. Sci. USA, 85, 9436-9440 (1988), have disclosed
a method for direct sequencing of chromosomal DNA amplified by the
polymerase chain reaction. For any DNA template, however, the principle
behind the dideoxy chain termination method remains the same. There is a
competition for incorporation of the normal deoxy- and the dideoxy-
nucleotide by the polymerase into the growing complementary chain. When
a dideoxy nucleotide is incorporated, further chain extension is prevented.
Since there is a finite probability that this chain terminating event may
occur at each complementary site of the appropriate base, a mixture of all
possible fragments ending in that dideoxy base will be generated. This
mixture of fragments can be separated by size via gel electrophoresis.
When the experiment is repeated with each dideoxy base, four mixtures of
fragments, each terminating in a specific residue, are produced. When this
set of mixtures is chromatographed in four adjacent lanes, so that fragment
lengths in the four mixtures can be correlated with each other, the
sequence of the original DNA is determined by relating the fragment length
to the identity of the terminating dideoxy base.
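The four-lane readout described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a laboratory procedure: the function names `terminated_lengths` and `read_lanes` and the example strand are hypothetical, and termination is idealized as occurring exactly once at every site of the matching base.

```python
def terminated_lengths(strand, base):
    """Lengths of the fragments of `strand` that end in the dideoxy analog of `base`."""
    return [i + 1 for i, b in enumerate(strand) if b == base]


def read_lanes(lanes):
    """Rebuild the strand by correlating fragment lengths across the four lanes."""
    base_at_length = {}
    for base, lengths in lanes.items():
        for n in lengths:
            base_at_length[n] = base
    return "".join(base_at_length[n] for n in sorted(base_at_length))


strand = "GATTACA"  # hypothetical newly synthesized strand, read 5' to 3'
lanes = {base: terminated_lengths(strand, base) for base in "ACGT"}
decoded = read_lanes(lanes)
```

Each lane on its own gives only the positions of one base; the sequence emerges only when the lengths from all four lanes are merged, which is why the lanes must be comparable with each other.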

Maxam and Gilbert, Methods in Enzymology, 65, 499-560 (1980),
disclosed a method for DNA sequencing using chemical cleavage. In this
method, each end of a DNA fragment to be sequenced is labeled. This DNA
fragment is then cleaved preferentially at one of the nucleotides, under
conditions favoring one cleavage per strand. This procedure is then
repeated for each of the other three nucleotides. The four samples are
then run side by side on an electrophoretic gel. Autoradiography identifies
the position of a particular nucleotide by the length of the fragments
produced by cleavage at that particular nucleotide. This method suffers
from the same drawbacks as the Sanger method.

The position of the fragment in gel electrophoresis is usually revealed
by staining or by autoradiography. In autoradiography methods, the
fragments have typically been labeled with ³²P or ³⁵S radionuclides where
either the DNA primer or one of the component deoxynucleotides has been
tagged, and that label incorporated in a specific or random fashion. After
fractionation of the fragments on acrylamide gels, the gels are used to
expose films. This presents a number of difficulties. For example, the
short half-life of ³²P requires that the sequencing experiment be
anticipated days in advance so that fresh label can be used. Additionally,
the high energy beta radiation emitted by the ³²P leads to scission of the
phosphodiester linkages within the DNA fragments synthesized in the
sequencing reaction and thus requires immediate fractionation of sequencing
reaction products. The use of ³⁵S (Ornstein, et al., Biotechniques, 3, 476
(1985)), which has a longer half-life and less energetic emission, somewhat
ameliorates these problems, but requires much longer times of exposure to
film for the development of a usable autoradiograph, often in the range of
one to three days. Whichever radionuclide is used, the fact that a single
type of label is used for each sequencing reaction requires that each set of
reaction products be fractionated in a separate lane on the sequencing gel.
Common problems in running sequencing gels include uneven heating and
the presence of impurities, either of which can cause adjacent lanes on the
sequencing gel to run in an uneven fashion, making the comparison of
fragment migration in adjacent lanes, and thus DNA sequence determination,
difficult or impossible. The use of unstable radionuclides also poses a
health risk to the investigator.

An alternate method of detection was developed by the California 
Institute of Technology group (Smith, et al., Nature, 321, 674 (1986)) in
which the terminal base residues are labeled with a fluorescent marker
attached to the DNA primer. If four fluorescent markers of different
spectral emission maxima are used, then the four separate sets of
polymerase fragments can be combined and co-chromatographed. This
method is also disclosed in EPO Patent No. 87300998.9.

A second variation of the fluorescent tagging approach has recently
been reported by the DuPont group (Science, 238, 336 (1987)) wherein a
unique fluorescent moiety is attached directly to the dideoxy nucleotide.
This may represent an improvement over the CalTech primer tagging
approach in that a single polymerase experiment can now be run with a
mixture of the four dideoxy terminating bases. However, one trade-off for
this simplification is potential replication errors by the polymerase, arising
from mis-incorporation of the modified dideoxynucleotide base analogs.

These modified Sanger methods are an improvement over the original 
Sanger method in the extent to which DNA can be sequenced because the
chromatographic ambiguities have been reduced. However, a number of
limitations are associated with the use of fluorescent labels in these
modified Sanger reactions. In particular, there are chromatographic
differences among fragments arising from the unique mobilities of the
different organic fluorescent markers. Moreover, there are difficulties in
distinguishing individual fluorescent markers because of overlap in their
spectral bandwidths. Finally, there is a low sensitivity of detection 
inherent in the extinction coefficients of the fluorescent markers. 

All of the above variants of the Sanger method for sequencing have 
used slab gel electrophoresis to effect size separation of the DNA
fragments. The casting and loading of slab gels is a skilled but
intrinsically manual operation. The only aspect of this process which has
been automated with any success is the reading of the gel by certain 
commercial devices with some type of laser scanner/spectrophotometer. 

A labeling method is needed which eliminates chromatographic 
ambiguity by imparting to each sequencing reaction product its own specific 
tag, but in which this specific tag is "invisible" to the chromatographic 
apparatus, i.e. does not affect the chromatographic mobility of the
different sequencing products differentially. Additionally, a label detection
system is needed which is much more sensitive than the fluorescence 
system, and which can make distinction in labels based upon characteristics 
which separate them discretely, rather than by trying to distinguish 
between broad overlapping traits. Ideally, a stable, non-radioactive label 
would be used eliminating the short useful lifetime of the label and 
products containing the label, as well as potential health risks to 
investigators.

Eckstein and Goody, Biochemistry, 15, 1685 (1976), disclose a method
of chemical synthesis for adenosine-5'-(O-1-thiotriphosphate) and
adenosine-5'-(O-2-thiotriphosphate).

Eckstein, Accounts Chem. Res., 12, 204 (1978), discloses a group of
phosphorothioate analogs of nucleotides.

Gish and Eckstein, Science, 240, 1520-1522 (1988), disclose an
alternative method for sequencing DNA and RNA employing base specific
chemical cleavage of phosphorothioate analogs of the nucleotides which were
incorporated in a cDNA sequence.

Japanese Patent No. 59-131,909 (1986) discloses a nucleic acid
detection apparatus which detects nucleic acid fragments which are
separated by electrophoretic techniques, liquid chromatography, or high
speed gel filtration. Detection is achieved by utilizing nucleic acids into
which S, Br, I, or Ag, Au, Pt, Os, Hg or similar metallic elements have
been introduced. These elements are generally absent in natural nucleic
acids. Introduction of one of these elements into a nucleotide of a nucleic
acid allows that nucleic acid or fragment thereof to be detected by means
of atomic absorption, plasma emission or mass spectroscopy. However, this
reference does not suggest or disclose any application of the described
methods or apparatus to the sequencing of DNA, such as by the Sanger
method. Specifically, it does not teach that a plurality of specific isotopes
may be used to identify the specific terminal nucleotide residues. Nor does
it teach that by total combustion of DNA to oxides of carbon, hydrogen,
nitrogen and phosphorus, the detection sensitivity by mass spectrometry for
trace elements, such as sulfur which is not normally found in DNA, is
vastly improved. The combustion step, which is one aspect of the present
application, is essential to eliminate the myriad of fragment ions from DNA.
These fragment ions would normally mask the presence of trace ions of SO₂
in conventional mass spectrometry. What this reference does disclose is
that DNA may be tagged (by undisclosed means) with trace elements,
including sulfur, as an aid to detection of DNA, and that these trace
elements may be detected by a variety of means, including mass
spectrometry.

Details of DNA sequencing are found in Current Protocols in Molecular
Biology, John Wiley & Sons, N.Y., N.Y., F. M. Ausubel, et al., eds. (1987),
Chapter 7 of which is hereby incorporated by reference. Smith, et al.,
Anal. Chem., 60, 438-441 (1988), describes capillary zone
electrophoresis-mass spectrometry using an electrospray ionization
interface and is hereby incorporated by reference.

SUMMARY OF THE INVENTION

This invention relates to improved methods for sequencing DNA, DNA
fragments, or other polynucleotides. The invention includes apparatus,
reagents and mixtures of reagents for carrying out the method. In
particular, this invention relates to the use of mass spectrometry to
identify the terminal nucleotide of a polynucleotide, based upon the
presence of a specific stable nuclide marker in the terminal nucleotide or
the polynucleotide fragment containing that particular terminal nucleotide.
The invention offers numerous advantages over previous methods of
sequencing polynucleotides, including greater sensitivity, increased signal
specificity, simplified manipulation and safer handling.

BRIEF DESCRIPTION OF THE FIGURES

FIGURE 1 shows a schematic diagram of a complementary DNA
sequence attached to a primer DNA sequence and a typical series of chain
terminated polynucleotide fragments prepared according to Scheme E herein.

FIGURE 2A shows in combination a column for separating DNA
sequences according to size and a means for sequentially transporting DNA
sequences to a mass spectrometer.

FIGURE 2B shows superimposed "ion current vs. time" printouts for
³²SO₂, ³³SO₂, ³⁴SO₂, and ³⁶SO₂, resulting from combustion of a chain
terminated DNA sequence.

BRIEF DESCRIPTION OF THE INVENTION

This invention relates to methods, reagents, apparatus and intermediates
involved in the determination of natural or artificially made ("recombinant")
DNA sequences and fragments thereof. This invention seeks to eliminate
numerous deficiencies in the prior art by embodying greater convenience,
less chromatographic ambiguity, greater sensitivity and safer handling than
existing procedures. In particular, this invention involves the determination
of DNA sequences using a combination of chain termination DNA sequencing
techniques and mass spectroscopy. Thus, in a typical chain terminating
DNA sequencing determination such as taught by Sanger, et al., Proc. Natl.
Acad. Sci. USA, 74, 5463 (1977), a DNA primer, deoxynucleotide
triphosphates and dideoxynucleotide triphosphates, in the presence
of a DNA polymerase such as Klenow fragment, are used to determine the
DNA sequence. However, in embodiments of the present invention the DNA
primer, the deoxynucleotides or the dideoxynucleotides are labeled with
isotopes detectable by mass spectrometry to determine the DNA sequence.

For example, if the dideoxynucleotide (A, G, C, T) triphosphates,
abbreviated as ddATP, ddGTP, ddCTP and ddTTP respectively, are labeled
with isotopes of different masses respectively, and the corresponding chain
terminated fragments are separated and analyzed by
mass spectrometry, a direct reading of the DNA sequence is obtained.
Generally, the labeled component of each dideoxynucleotide component of
the chain terminated DNA sequence is converted to a more convenient
species for mass spectrometry determination, i.e. sulfur isotopes are oxidized
to sulfur dioxide. If the DNA primer or deoxynucleotides are labeled,
reactions with specifically labeled deoxynucleotides must be first carried
out in the presence of a specific dideoxynucleotide. This is necessary so
that a specific label is associated with a specific chain terminated DNA
sequence. Once the individual reactions are conducted, the chain
terminated DNA sequences can be mixed, separated, and analyzed by mass
spectrometry because there will then be a specific relationship between a
specific isotope and the terminal dideoxynucleotide. This invention is much
more sensitive than existing systems and therefore is especially useful in
determining the sequence of small quantities of DNA which are
contaminants in products resulting from fermentation and other
biotechnology related processes, i.e. for "screening" applications. The
invention also includes reagents and analytical instruments for carrying out
the above methods as well as intermediate mixtures of chain terminated
DNA sequences produced while carrying out the methods of the present
invention.
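The "direct reading" described above can be illustrated as follows. This is a minimal sketch, assuming that each chain terminated fragment yields one sulfur dioxide peak, that shorter fragments reach the mass spectrometer first, and that the m/e values 64, 65, 66 and 68 given later in this description are paired with A, C, G and T. The function name `direct_read` and the peak list are hypothetical.

```python
# Assumed pairing of SO2 ion masses with bases; any one-to-one assignment would do.
ION_TO_BASE = {64: "A", 65: "C", 66: "G", 68: "T"}


def direct_read(peaks):
    """Decode a sequence from (elution_time, m/e) pairs, one pair per fragment."""
    # Shorter fragments are assumed to elute first, so time order is size order.
    ordered = sorted(peaks, key=lambda peak: peak[0])
    return "".join(ION_TO_BASE[mz] for _, mz in ordered)


# Hypothetical peak list, deliberately out of order.
peaks = [(3.1, 68), (1.0, 66), (2.2, 64), (4.0, 68)]
sequence = direct_read(peaks)
```

Because the identity of the terminal base travels with the fragment as a mass label, a single separation is enough; no comparison between lanes is needed.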

DETAILED DESCRIPTION OF THE INVENTION

This invention relates to an improved method for sequencing
polynucleotides using mass spectrometry to determine which of the four
bases (adenine, guanine, cytosine or thymine) is a component of the
terminal nucleotide. In particular, the present invention relates to
identifying the individual nucleotides in a DNA sequence by the mass of
stable nuclide markers contained within either the dideoxynucleotides, the
DNA primer, or the deoxynucleotides added to the primer. The invention
also includes reagents and analytical instruments for carrying out the above 
methods as well as mixtures of chain terminated DNA sequences. 

Information regarding the identity of the terminal base in a particular 
fragment may be signified by using a unique isotopic label for each of the 
four bases. The determination of which isotope marker is present, and thus
which terminal base a fragment contains, can then be readily accomplished
by mass spectral methods. Detection of ions by mass spectrometry is perhaps
the most sensitive physical method available to the analytical chemist, and
represents orders of magnitude better sensitivity than optical detection of
fluorescence.

If stable isotopes are chosen for labeling, then the isotope ratios are
fixed by the mode of synthesis. The group of suitable atomic ("nuclide")
markers includes those from carbon (¹²C / ¹³C), chlorine (³⁵Cl / ³⁷Cl),
bromine (⁷⁹Br / ⁸¹Br) and sulfur (³²S / ³³S / ³⁴S / ³⁶S). Since sulfur,
chlorine and bromine are not normal constituents of DNA, i.e. they are
"foreign", analysis for those foreign isotopes does not require consideration
of their natural abundance ratio. It is noted that sulfur is unique among
this group in that it alone contains four stable isotopes, each of which can
be used to represent one of the four nucleotide bases.

Further, if the fragments are subjected to combustion, then a light
volatile derivative of the marker atom can be detected. Combustion
converts DNA to the oxides of carbon, hydrogen, nitrogen and phosphorus.
The inclusion of a combustion step enormously simplifies the detection of 
trace atoms because it eliminates the problem of producing and analyzing 
high mass molecular ions. 

With sulfur, combustion of the polynucleotide fragments in a
hydrogen-oxygen flame or pyrolysis tube will yield sulfur dioxide (SO₂).
Thus, the terminal base of the fragment may be identified by determining
the mass of the SO₂ ion as 64, 65, 66, or 68. This is a simple distinction
for existing mass spectral devices using either quadrupole or permanent
magnet analyzers. For a permanent magnet device, a set of four permanently
fixed ion detectors can be mounted to continuously monitor the individual
ion currents. A quadrupole analyzer with a single ion-multiplier detector
is presently preferred.
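The m/e values quoted above follow from simple arithmetic on nominal isotope masses. The sketch below assumes that both oxygen atoms are ¹⁶O and that the sulfur isotopes are assigned to A, C, G and T in order of increasing mass; the text requires only that the assignment be unique, so this particular pairing and the names `SULFUR_LABEL`, `so2_mass` and `base_from_ion` are illustrative.

```python
OXYGEN_16 = 16  # nominal mass of the dominant oxygen isotope (assumed for both oxygens)

# Assumed assignment of stable sulfur isotopes (nominal masses) to bases.
SULFUR_LABEL = {"A": 32, "C": 33, "G": 34, "T": 36}


def so2_mass(sulfur_mass):
    """Nominal mass of the SO2 ion formed from one sulfur isotope."""
    return sulfur_mass + 2 * OXYGEN_16


# Reverse lookup from the detected ion mass to the terminal base.
MASS_TO_BASE = {so2_mass(mass): base for base, mass in SULFUR_LABEL.items()}


def base_from_ion(mass_to_charge):
    """Identify the terminal base from the m/e of a singly charged SO2 ion."""
    return MASS_TO_BASE[mass_to_charge]
```

The four ions are separated by at least one mass unit, which is why four fixed detectors or a single scanning quadrupole can tell them apart.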

There are numerous ways in which a marker isotope could be
incorporated into the complementary DNA fragment. These include
substituting the marker isotope on the pyrimidine, purine, or ribose
moieties, or the phosphate bridges between individual nucleotides. Further,
the marker isotope may be contained in part of the cDNA primer, randomly
incorporated along the chain in one or several of the deoxy-base units, or
specifically in the terminal dideoxy residue. The only restriction is that
the particular substitution be unique for that particular set of fragments.

The site for the stable sulfur label is most preferably the phosphate
bridge, using labeled thiophosphate in place of ordinary phosphate. The
technique for inserting a stable thiophosphate label in place of ordinary
phosphate is similar to that employed in conventional ³⁵S radiolabeling
experiments. The chemistry and enzymology of the polymerase reaction
using deoxynucleotide α-thiotriphosphates have been investigated extensively.
Any future developments in cloning vectors or polymerase enzymes should
also be able to utilize the thiophosphate derivatives of the present
invention.

If the isotope label is to be incorporated into the cDNA primer or
randomly along the chain as a deoxy-base surrogate, then it is necessary to
perform a separate polymerase experiment with each of the appropriate
dideoxy-base residues prior to mixing and chromatography. The advantage
of using primer or intra-chain labeling is that several atoms of the marker
isotope may be incorporated per molecule of DNA fragment, and thus enhance
detection sensitivity.

If, on the other hand, the isotope label is contained in the dideoxy
base itself, then it is not necessary to perform individual polymerase
experiments. Instead, a mixture of the four dideoxy bases, each with a
unique isotope label, together with a mixture of the four normal deoxy
bases in stoichiometric ratios appropriate for the specific polymerase
enzyme, could be used to generate the complete set of labeled fragments in
a single polymerase experiment. Each fragment, regardless of its size, will
contain one atom of the marker isotope on its terminal (dideoxy) nucleotide,
wherein the marker isotope would indicate the identity of the terminal
nucleotide.
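The single-reaction scheme above can be modeled as below. This is an idealized sketch: the isotope assignment in `DIDEOXY_LABEL`, the function name `labeled_fragments` and the example template are assumptions, and termination is taken to occur once at every position rather than statistically.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hypothetical one-to-one assignment of sulfur isotopes to dideoxy terminators.
DIDEOXY_LABEL = {"A": 32, "C": 33, "G": 34, "T": 36}


def labeled_fragments(template):
    """Return (bases_added, sulfur_isotope) for every chain terminated copy.

    `template` is written 3' to 5', so the complementary strand grows left
    to right. `bases_added` counts nucleotides beyond the primer.
    """
    copy = [COMPLEMENT[base] for base in template]
    return [(n + 1, DIDEOXY_LABEL[base]) for n, base in enumerate(copy)]


fragments = labeled_fragments("CTAAG")  # hypothetical template
```

Every fragment carries exactly one marker atom, so sorting the pairs by size and reading off the isotopes recovers the complementary sequence from one reaction and one separation.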

Schemes A and B below illustrate typical sulfur and halogen labeling
respectively,

wherein an asterisk (*) is used herein to indicate the presence of an
isotopic label in accordance with the present invention;

wherein by A, T, G, and C is meant the bases adenine, thymine,
guanine, and cytosine respectively;

wherein by "Alk" is meant straight or branched chain lower alkyl of
1-6 carbon atoms;

wherein by "S*" is meant a sulfur isotope of the group consisting of
³²S, ³³S, ³⁴S and ³⁶S with the proviso that each isotope be uniquely
associated with a member of the group consisting of A, T, G, and C
respectively; and

wherein by "X*" is meant a halogen isotope of the group consisting
of ³⁵Cl, ³⁷Cl, ⁷⁹Br and ⁸¹Br with the proviso that each isotope be uniquely
associated with a member of the group consisting of A, T, G and C
respectively.

SCHEME A

Sulfur Labeling

(a) At the phosphate bridge of "Rib":

[Structural formula not reproduced]

SCHEME B

Chlorine/Bromine Labeling

(a) On the deoxyribose

[Structural formula not reproduced]

(b) On the base component

[Structural formula not reproduced]

Labeling Schemes C, D, and E below show three ways in which
specific isotopes, designated as *1, *2, *3, and *4, can be uniquely
associated with specific terminal nucleotides in a terminated complementary
DNA (cDNA) sequence. For convenience in Schemes C-E, the "TP"
designation for the deoxy and dideoxy triphosphates has been deleted. In
Schemes C and D, the dideoxy chain terminating reaction is conducted
separately and then the terminated chains are mixed prior to separation. In
particular, Scheme D is a modification of the chemical cleavage procedure
of Gish and Eckstein, Science, 240, 1520 (1988), whereby the DNA fragment
undergoes selective alkaline cleavage adjacent to the phosphorothioate
linkage, leaving a labeled deoxy compound as the terminal nucleotide in the
fragment. In Scheme D, one creates a series of such fragments which
differ from one another solely in size, via the presence of an additional
terminal nucleotide. Identification of each terminal nucleotide (via each
isotopic marker) in relation to size of the fragment provides the base
sequence of the DNA or polynucleotide of interest.

In Scheme E, a mixture of the four individually labeled
dideoxynucleotide triphosphates, together with a mixture of the four deoxy
nucleotide triphosphates, is reacted together with the primer ("P") in a
single reaction. Because only one reaction and one separation need be run
in Scheme E, it can be readily seen that the labeled dideoxy scheme,
Scheme E, is the preferred method of the present invention. Particularly
preferred reactants in Scheme E are the labeled dd(A*, C*, G*, or T*)
triphosphates where the labels ³²S, ³³S, ³⁴S and ³⁶S replace a phosphate
oxygen as shown in Scheme A.

SCHEME C

Labeled Primer

1. P*1 + d(A,C,G,T) + dd(A) ->  P*1..dN.dN...dd(A)
2. P*2 + d(A,C,G,T) + dd(C) ->  P*2..dN.dN...dd(C)
3. P*3 + d(A,C,G,T) + dd(G) ->  P*3..dN.dN...dd(G)
4. P*4 + d(A,C,G,T) + dd(T) ->  P*4..dN.dN...dd(T)

SCHEME D

Labeled Deoxy

1. P + d(A,C,G,T) + d(A*1) + dd(A) ->  P..dN.dA*1.dN...dd(A)
2. P + d(A,C,G,T) + d(C*2) + dd(C) ->  P..dN.dC*2.dN...dd(C)
3. P + d(A,C,G,T) + d(G*3) + dd(G) ->  P..dN.dG*3.dN...dd(G)
4. P + d(A,C,G,T) + d(T*4) + dd(T) ->  P..dN.dT*4.dN...dd(T)

SCHEME E

Labeled Dideoxy

1. P + d(A,C,G,T) + dd(A*1, C*2, G*3, T*4) ->

   P..dN.dN...dd(A*1)
   P..dN.dN...dd(C*2)
   P..dN.dN...dd(G*3)
   P..dN.dN...dd(T*4)

Accordingly, the invention includes not only reagents but also a
mixture of unique isotopically labeled dideoxynucleotide triphosphates
(ddC*TP, ddG*TP, ddT*TP and ddA*TP) where each dideoxynucleotide
triphosphate is labeled with a different sulfur or halogen isotope. In
particular, sulfur labeling, consisting of the isotopes ³²S, ³³S, ³⁴S and
³⁶S, is preferred.

The invention also includes the intermediate mixture of dideoxy
chain-terminated DNA sequences in which each chain-terminated DNA sequence
contains an isotope measurable by mass spectrometry and in which each
isotope relates to a specific chain-terminating dideoxynucleotide. The
stable nuclide marker may be incorporated into the DNA primer, the DNA
chain extended from the primer, or the dideoxynucleotide which terminates
the DNA chain extended from the primer. The invention also includes the
mixture of isotope-labeled, chain-terminated DNA sequences separated by
size.

Although this invention has been discussed in terms of sequencing 
DNA, the method and the reagents of the present invention could also be 
used to sequence RNA by providing reverse transcriptase as the polymerase.

Figure 1 illustrates a complementary DNA sequence attached to a
promoter DNA sequence and a typical series of chain terminated
polynucleotide fragments prepared according to Scheme E from mixtures of
deoxynucleotide triphosphates and labeled dideoxynucleotide triphosphates.
These labeled fragments illustrate labeled chain terminated complementary
DNA sequences 1 prepared by the method of the present invention, wherein
the size of each complementary DNA fragment corresponds to the relative
position of that fragment's terminal nucleotide in the overall complementary
DNA sequence. These labeled fragment sequences are separated by size by
an electrophoresis column 2. The fragments from the electrophoresis
column 2 are sequentially eluted to a detector 3. Figure 2A shows in more
detail the apparatus for determining a DNA sequence. DNA sequences are
prepared in reaction chamber 4. The mixture of labeled terminated DNA
fragments is separated according to size by electrophoresis on a
polyacrylamide gel column 5 wherein migration occurs from the cathode
(V-) to the anode (V+). The fractions are taken off the polyacrylamide gel
column 5 sequentially by size at transfer point 6, where a means 7 is
provided for transferring the terminated DNA fragments to an oxidizer or
combustion chamber 8. In the oxidizer or combustion chamber 8, the sulfur
label is oxidized to SO₂ and the labeled SO₂ is detected in a mass
spectrometer 9. Figure 2B shows typical superimposed "ion current v. time"
plots for m/e 64, 65, 66, and 68, corresponding to ions produced by the
four stable isotopes of sulfur, i.e., ³²SO₂, ³³SO₂, ³⁴SO₂ and ³⁶SO₂,
respectively. When the stable isotopes of sulfur associated with the bases
A, C, G and T are ³²S, ³³S, ³⁴S and ³⁶S, respectively, a plot corresponding
to the DNA sequence illustrated at the top of Figure 2B is obtained. In
this manner, the DNA sequence of any genetic material can be determined
automatically and on femto- to nanomolar quantities of material.
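
The read-out described above can be sketched in a few lines of Python. This is only an illustration, not part of the disclosed apparatus: it assumes the base-to-isotope assignment given in the text (A, C, G, T labeled with ³²S, ³³S, ³⁴S, ³⁶S), nominal integer masses, and a hypothetical input format in which each eluted fraction is a dictionary of ion currents keyed by m/e. The function names are invented for the example.

```python
# Assumed assignment from the text: A=32S, C=33S, G=34S, T=36S.
SULFUR_MASS_FOR_BASE = {"A": 32, "C": 33, "G": 34, "T": 36}
OXYGEN_MASS = 16  # nominal mass of 16O


def so2_channel(sulfur_mass):
    """Nominal m/e of the SO2 ion carrying the given sulfur isotope."""
    return sulfur_mass + 2 * OXYGEN_MASS


# Maps each monitored m/e channel (64, 65, 66, 68) back to its base.
CHANNEL_TO_BASE = {so2_channel(m): b for b, m in SULFUR_MASS_FOR_BASE.items()}


def call_bases(fractions):
    """Call one base per eluted fraction (shortest fragment first).

    `fractions` is a hypothetical list of {m/e: ion current} dictionaries;
    the base whose channel carries the largest current is reported.
    """
    sequence = []
    for currents in fractions:
        channel = max(CHANNEL_TO_BASE, key=lambda ch: currents.get(ch, 0.0))
        sequence.append(CHANNEL_TO_BASE[channel])
    return "".join(sequence)


# Made-up ion currents for four successive fractions.
example = [
    {64: 9.0, 65: 0.1, 66: 0.2, 68: 0.0},
    {64: 0.3, 65: 7.5},
    {66: 5.0},
    {68: 4.0},
]
print(call_bases(example))
```

A real trace would need peak detection and background subtraction before this step; the sketch only shows how the four m/e channels map back to bases.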

There are several variations in the design of an automated DNA
sequencer of the present invention. The major components of the device
are the reaction chamber for conducting polymerase reactions, the
chromatographic device consisting of some form of electrophoresis, the
effluent transport, the combustion system, and the mass spectral analyzer.
Because this instrument is designed to operate on femto- to nanomolar
quantities of DNA, it is important that the geometry of all component
systems be kept to a minimum size.

The chromatographic system may be of a laned plate or tubular
configuration. In the plate designs, the supporting medium for the
chromatographic separation will be most preferably a polyacrylamide gel,
where the ratio of acrylamide to bis-acrylamide is more preferably between
10:1 and 100:1. Although persulfate is the typical polymerization catalyst
used by most workers to prepare polyacrylamide gel plates, the background
of sulfate ions may be unacceptably high without extensive washing.
Ultraviolet irradiation can be used successfully to initiate cross-linking and
produce high quality gels, which can be used immediately without washing.
For tubular designs, the chromatographic separation may be conducted
in a gel-filled capillary or in an open tubular configuration. The preferred
dimensions of the capillary depend on whether an open or gel-filled medium
is selected. For gel-filled devices, preferred diameters are 50 to 300
microns. In open tubular configurations, however, the preferred diameters
are 1 to 50 microns. The preferred length of the capillary depends on the
diameter and the amount of DNA sample which will be applied, as well as
the field strength of the applied electrophoretic voltage. The preferred
length is optimally between 0.5 and 5 meters.

For open tubular configurations, the capillary will preferably be
fabricated from fused silica. Under typical operating conditions, where pH
of the buffer is usually maintained in a range of 5.0 to 11.0, the surface of
the silica will have a net negative charge. This surface charge establishes
conditions in which there is a bulk electroosmotic flow of buffer toward the
negative electrode. The DNA fragments also possess a negative charge and
are attracted to the positive electrode. Hence, they will move
more slowly than the bulk electroosmotic flow. In gel-filled devices, the
supporting medium minimizes electroosmosis. Since the gel has no charge,
the negatively charged DNA fragments migrate toward the positive
electrode.
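
The competition between electroosmotic flow and electrophoretic migration in an open capillary can be illustrated with a short calculation. The mobilities and field strength below are assumed, order-of-magnitude values chosen only for illustration; they do not come from this disclosure, and the function name is invented for the example.

```python
def net_velocity(mu_eo, mu_ep, field_strength):
    """Net migration velocity in m/s.

    Positive values point toward the negative electrode. Mobilities are in
    m^2/(V*s); the electrophoretic mobility of an anion such as DNA is
    negative under this sign convention.
    """
    return (mu_eo + mu_ep) * field_strength


MU_EO = 5.0e-8    # assumed electroosmotic mobility of the buffer in bare silica
MU_DNA = -3.5e-8  # assumed electrophoretic mobility of a DNA fragment
FIELD = 2.0e4     # assumed field strength, V/m (200 V/cm)

bulk_flow = net_velocity(MU_EO, 0.0, FIELD)
dna_fragment = net_velocity(MU_EO, MU_DNA, FIELD)

print(f"bulk flow:    {bulk_flow:.2e} m/s")
print(f"DNA fragment: {dna_fragment:.2e} m/s")
```

With these assumed numbers the fragment is still carried toward the negative electrode, but more slowly than the buffer, which is the behavior the paragraph above describes.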

There are a variety of techniques to modify the surface charge on the
wall of an open capillary. One particularly useful method is to covalently
modify the wall with a monomer such as methacryloxypropyltrimethoxysilane.
This monomer can then be crosslinked with the acrylamide to give a thin
bonded monolayer which is similar in characteristics to the polyacrylamide
gel-filled capillaries. The distinct advantage of the coated wall method is
that the capillary can be recycled after each analysis run simply by flushing
with fresh buffer. (See S. Hjertén, J. Chromatography, 347, 191 (1985) for
details.)

The transfer system is selected to match the particular
chromatographic design. The chromatographic system may be of laned plate
or tubular configuration. It is desirable to have the chromatography
effluent in a closed environment. The tubular configurations may be more
amenable to sample transfer designs which pump, spray or aspirate the
column effluent into the combustion chamber, and thus minimize degradation
in resolution because of post-chromatographic remixing of fractions.

For the open plate type devices, a moving belt or wire system can be
used satisfactorily. In this system, a thin coating of the column effluent is
spread on the ribbon to transport the eluted fractions through a pre-
drying oven and then into the combustion furnace. The transport ribbon
may be fabricated from platinum or other noble metal, and may be
continuously looped because the ribbon can be effectively cleaned upon
passage through the combustion furnace. Less preferably, the ribbon may
be fabricated from a glass or ceramic fiber or carbon steel. In this design,
the ribbon would be taken up on a drum for disposal.

For tubular configurations, the most preferred embodiment of the
transfer system is to use an electrospray nebulizer method to create a fine
aerosol of the column effluent. In this technique, a small charge of
optimally less than 3000 volts is applied to the emerging droplet. The
charge of a larger droplet tends to disperse it into a very fine mist of
singly charged droplets. These fine charged droplets can be focused and
directed by electric or magnetic fields, much as in an ink-jet printer head.
It is important to control the temperature of the flowing gas stream into
which the aerosol is introduced. If the temperature is too great, then the
droplets will tend to evaporate on the capillary injector. If the
temperature is too low to overcome the surface tension of the effluent, the
individual droplets will not be adequately dispersed. The composition and
ionic strength of the supporting electrolyte are important. The preferred
buffers are phosphate or tris-acetate at less than 0.1 M concentration.
Because the overall method is based on detection of trace atoms, it is
critically important that the buffers be free of contaminant ions, such as
sulfate.

A second satisfactory method to create an aerosol of the column
effluent is an ultrasonic device which produces sufficient mechanical shear
and local heating to disperse the droplet. A similar type of shear may also
be generated off the tip of a capillary injector into a venturi type
aspirator, where the flow of supporting gas is developed by the pressure
differential into the mass spectrometer. In these designs, it is often
desirable to add small amounts of additional aqueous or organic solvents at
the tip in order to aid in the flash evaporation of the effluent.

The combustion section is designed to completely burn the vaporized
or surface evaporated chromatography effluent of DNA fragments together
with the supporting electrolytes and optional solvent modifier to the oxides
of carbon, hydrogen, nitrogen, phosphorus, and most importantly, those of
the marker isotope. The sensitivity of detection of the marker isotope is
greatly affected by the efficiency of combustion, since low molecular weight
fragment ions which may result from incomplete burning can mask the
presence of the primary detection species. The sequencing preparation must
be free of other ions in the mass/charge range of 64 to 68, since such ions
would interfere with detection of the isotopic sulfur dioxide.

The combustion may be accomplished at moderate to approximately
atmospheric pressure prior to injection into the mass spectrometer. For
sulfur containing streams, essentially complete combustion to sulfur dioxide
will be achieved when the sample is heated to temperatures in excess of
approximately 900°C in an oxygen environment.

The most rugged design is to simply aspirate the column effluent into
a hydrogen-oxygen flame, similar in design to standard gas chromatography
flame ionization detectors. The important characteristics of the flame, the
temperature and sample residence time, will be determined by the ratio of
hydrogen to oxygen, the aggregate flow rate of gases and the local
pressure. The characteristics of sulfur containing flames at 100-150 torr
have been described by Zachariah and Smith, Combustion and Flame, 69, 125
(1987). A limitation to the sensitivity of this design is the volume of gas
(water vapor) resulting from hydrogen-oxygen combustion which effectively
dilutes the sulfur dioxide. Although the standard mass spectrometry
techniques such as gas separators or semipermeable membranes may be used
to remove water vapor, there is a trade-off between sample dilution and
ultimate detectability which must be considered for each design.

A preferred method to very efficiently burn the nebulized column
effluent is to inject it into a short heated tube in an oxygen environment.
The tube may be constructed of noble metals such as platinum, ceramic or
quartz, depending on the method of heating. The external heating action
may be provided by a cartridge electrical resistance heater or an external
flame. The tube may be packed with a heat exchanger medium such as
glass wool. Optionally, a catalytic surface may also be provided by such
materials as supported platinum or copper oxide to enhance combustion
efficiency.

A particularly effective method to burn the sample is in an inductively
coupled oxygen plasma, where the tube forms the resonant cavity of a
microwave generator. The inductively coupled plasma techniques have been
reviewed by G. Meyer, Anal. Chem., 59, 1345A (1987).

Alternatively, the combustion may be effected within the low pressure
environment of the mass spectrometer. A standard ionization technique in
mass spectrometry is fast atom bombardment. An energetic beam of atoms,
usually xenon, is produced in a plasma torch and directed toward the
sample. Ionization occurs by collision induced dissociation of this beam
with the sample. If instead of pure xenon, oxygen is introduced into the
fast atom beam, then both oxidation and ionization of the sample can occur.
The limitation of this method, however, is the difficulty of achieving
quantitative oxidation, and thus minimizing the background signal from
incompletely oxidized low mass fragments.

There are several methods to effect ionization of sulfur dioxide. The
objective is to obtain as high ion current as possible. In designs which
operate at atmospheric pressure by flame, corona discharge needle or
microwave induced plasma discharge, the ionization efficiency will be very
high. In this type of design, a portion of the ionized gas is introduced
into the low pressure region of the mass analyzer through a small sampling
orifice or skimmer cone. The size of the orifice, and thus the percentage
of total combustion sample which can be introduced, will depend on the
pumping speed of the vacuum system. Generally, less than five percent of
the sample will be transferred to the analyzer region. Designs of this type
have been described by T. Covey, Anal. Chem., 58, 1451A (1986) and by G.
Hieftje, Anal. Chem., 59, 1644 (1987).
Alternatively, the sample may be ionized in the lower pressure region
near the analyzer. This may be achieved by such methods as electron
impact using the beam emanating from a hot filament, by fast atom
bombardment with inert gases such as xenon, or by chemical ionization with
a variety of light gases. In these types of design, although the ionization
efficiency is low relative to atmospheric methods, a greater percentage of
those ions actually get to the analyzer section. The electron impact
techniques have been described by A. Bandy, Anal. Chem., 59, 1196 (1987).

An RF-only quadrupole mass filter may be used to help separate low
molecular weight combustion products (H₂O, N₂ and CO₂).

The analyzer and ion detector sections can be selected from several
commercially available designs. The analyzer may be a quadrupole device
where mass selection depends on the trajectory in a hyperbolic field, a field
swept electromagnetic device with a single ion detector, or a permanent
magnet device with an array of four ion detectors tuned to the isotopes of
interest. The detector may be of single stage or ion multiplier design,
although the latter type is preferred for highest sensitivity.
When the mass spectrometer is being used to detect isotopes of sulfur
dioxide, as would be the case when dideoxy terminated thiophosphates are
being utilized, very high sensitivity is achieved when the polarity of
the spectrometer is set to determine the positive ion spectrum. However,
in glow discharge ionization, highest sensitivity is achieved when the
spectrometer is operated in the negative ion mode.

When the mass spectrometer is being used to detect isotopes of
chlorine or bromine, then maximum sensitivity will be achieved when the
spectrometer polarity is reversed, and the negative ion spectrum is detected.

The labeled compounds of the present invention are prepared by
conventional reactions employing commercially available isotopes. For
example, the sulfur isotopes ³²S, ³³S, ³⁴S and ³⁶S are commercially
available as CS₂ or H₂S from the Department of Energy (Oak Ridge, Tenn.
or Miamisburg, Ohio) at "isotopic enrichments" of 99.8%, 90.8%, 94.3% and
82.2%, respectively. Although the method of the present invention will
provide satisfactory results with the "isotopically enriched" commercial
products, it is preferred that the sulfur isotopes be at 99.5% enrichment to
facilitate interpretation of the ion current v. time plots.

Enrichment techniques for sulfur are well known in the art. In
particular, CS₂ can be further enriched by fractional distillation which
takes advantage of the different boiling points of CS₂ conferred by the
various sulfur isotopes. Alternatively, gaseous diffusion of SF₆ also can
provide further enrichment of the sulfur isotope of interest. Thereafter,
the isotopically enriched CS₂*, H₂S* or SF₆* are converted into the
reagents described herein by techniques well known in the art.

The "halogen" isotopes ³⁵Cl, ³⁷Cl, ⁷⁹Br and ⁸¹Br are also commercially
available from the Department of Energy in either elemental form or as the
corresponding halide salt at enrichments of 99%, 95%, 90% and 90%,
respectively. These isotopes are used herein in their commercially available
form.

As shown in Scheme F, labeled 2',3'-dideoxynucleotide-5'-(O-1-
thiophosphates) III are prepared by initially reacting a 2',3'-
dideoxynucleoside I with isotopically enriched thiophosphoryl chloride
(PS*Cl₃) in triethyl phosphate, wherein by "S*" is meant a sulfur isotope
that is a member of the group consisting of ³²S, ³³S, ³⁴S and ³⁶S in
isotopically enriched form. From the above reaction, the correspondingly
2',3'-dideoxynucleotide-5'-(O-1-thiotriphosphate) III is prepared from II by
dissolving the bis-triethylamine (TEA) salt of II in dioxane and reacting it
with diphenyl phosphochloridate to form the diphenyl phosphate ester of
II. This phosphate ester is further reacted with the tetrasodium salt of
pyrophosphate in pyridine to form III. Purification of III is accomplished
by chromatography on diethylaminoethyl (DEAE) cellulose.

Scheme G shows the general method for preparing 3'-halo-2',3'-
dideoxynucleosides V wherein the halogen ("X*") is a member of the group
consisting of ³⁵Cl, ³⁷Cl, ⁷⁹Br and ⁸¹Br in isotopically enriched form. In
particular, a solution of 1-(5-O-triphenylmethyl-2-deoxy-β-D-
threopentofuranosyl)nucleoside, wherein by nucleoside is meant A, T, G, or
C, in a basic solvent, such as pyridine, is reacted with methanesulfonyl
chloride. The resulting mesylate is treated with an isotopically enriched
salt, such as LiX*, in the presence of heat and then acidified to produce a
3'-halo-2',3'-dideoxynucleoside V. Compounds of Formula V can be
converted to the corresponding labeled nucleotide monophosphate VI by
reaction with cyanoethyl phosphate and dicyclohexylcarbodiimide (DCC)
followed by LiOH deblocking.

The corresponding triphosphate of V is prepared from the
monophosphate as described for the conversion of II to III above.

Scheme H presents a method for halogenating a purine or pyrimidine
base of a 2',3'-dideoxynucleoside VII using isotopically enriched elemental
bromine (⁷⁹Br or ⁸¹Br) or chlorine (³⁵Cl or ³⁷Cl), i.e., X₂*. The 2',3'-
dideoxynucleoside VII is dissolved in a polar solvent, such as dry DMF, in
the presence of a base, such as pyridine. To the reaction mixture is added
a molar equivalent of the elemental halogen (X₂*) and the reaction mixture
is allowed to stir for 12 hours. Evaporation of the solvent produces the
labeled 2',3'-dideoxynucleoside VIII wherein the isotopic halogen label is on
the purine or pyrimidine base of the dideoxynucleoside. Purification is
accomplished by conventional chromatographic techniques.

The labeled 2',3'-dideoxynucleoside VIII is converted to the
corresponding monophosphate and triphosphate as discussed above for VI
and III, respectively.

The examples described herein are intended to illustrate the present
invention and not limit it in spirit or scope.
SCHEME G

[Structural drawing not recoverable. Legible labels: cyanoethyl phosphate;
dicyclohexylcarbodiimide (DCC); base (A, T, G, or C); substituent X*;
product VI.]

[Second structural drawing, apparently Scheme H, not recoverable. Legible
labels: base (A, T, G, or C); VII; X₂*, pyridine; cyanoethyl phosphate;
dicyclohexylcarbodiimide (DCC); labeled base (A*, T*, G* or C*); VIII.]

EXAMPLE 1

Preparation of
[³³S]2',3'-Dideoxyadenosine-5'-Phosphorothioate

2',3'-Dideoxyadenosine (47 mg, 0.2 mmol) was suspended in triethyl
phosphate (0.5 ml) and heated briefly to 100°C. The solution was cooled to
4°C and treated with [³³S]thiophosphoryl chloride (37 mg, 0.22 mmol).
The mixture was agitated for 12 hr. at 4°C, and then treated with 2 ml 10%
barium acetate and agitated at 20°C for 1 hour. The suspension was
treated with 0.5 ml triethylamine and then with 5 ml 95% ethanol. The
suspension was agitated for 30 min. and then filtered. The precipitate was
washed with 50% aqueous ethanol and then water. The filtrate was
evaporated to dryness, and the solid taken up in water and
chromatographed on a column of diethylaminoethyl (DEAE) cellulose which
had been equilibrated with NH₄HCO₃. The column was eluted with 0.1 M
NH₄HCO₃ and the fractions absorbing at 260 nm were pooled and
evaporated. The solid was evaporated twice with 80% ethanol, twice with
80% ethanol containing 2% triethylamine (TEA), and finally with anhydrous
ethanol. There was obtained 44 mg of the bis-triethylamine salt of the
title product. A solution of the triethylamine salt in 1 ml methanol was
treated with 1 ml of a solution of 6M NaI in acetone. The precipitate was
washed with acetone and dried to give 32 mg of disodium salt of the title
product as a white solid.
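
The reagent quantities in Example 1 can be cross-checked with a short calculation. The molar masses below are not stated in the text; they are assumptions computed from the molecular formulas (C10H13N5O2 for 2',3'-dideoxyadenosine, and PSCl3 with a sulfur mass of about 33 for the labeled thiophosphoryl chloride). The choice of sulfur isotope changes the result by well under one percent.

```python
def mmol(mass_mg, molar_mass_g_per_mol):
    """Convert a mass in milligrams to millimoles."""
    return mass_mg / molar_mass_g_per_mol


# Assumed molar masses (g/mol), calculated from molecular formulas.
DIDEOXYADENOSINE_MW = 235.25  # C10H13N5O2
THIOPHOSPHORYL_CL_MW = 170.3  # PSCl3 carrying sulfur of mass ~33

nucleoside = mmol(47, DIDEOXYADENOSINE_MW)
reagent = mmol(37, THIOPHOSPHORYL_CL_MW)
equivalents = reagent / nucleoside

print(f"nucleoside:  {nucleoside:.2f} mmol")
print(f"PSCl3:       {reagent:.2f} mmol")
print(f"equivalents: {equivalents:.2f}")
```

Under these assumptions the masses reproduce the stated 0.2 mmol and 0.22 mmol, i.e. a slight excess of the thiophosphorylating reagent over the nucleoside.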

EXAMPLE 2

Preparation of
[³³S]2',3'-Dideoxyadenosine-5'-(O-1-Thiotriphosphate)

The bis-triethylamine (TEA) salt of the title product of
Example 1 (26 mg, 0.05 mmol) was dissolved in 1 ml dry dioxane and
treated with diphenyl phosphochloridate (0.015 ml, 0.075 mmol). The
mixture was agitated for 3 hr. at 25°C. A solution of dry pyrophosphate in
pyridine was prepared by dissolving the tetrasodium salt (220 mg, 0.5 mmol)
in 3 ml pyridine and evaporating twice, and then taking up in 0.5 ml
pyridine. This solution was added to the above solution of the crude active
ester, and stirred for 2 hr. The crude product was precipitated by addition
of ether (10 ml). The precipitate was dissolved in water and
chromatographed on DEAE-cellulose eluted with 0.1 M triethylammonium
bicarbonate. The pooled fractions contained 150 A₂₆₀ units (20%) of the
title product. The solution was lyophilized and the residue stored at -70°C.
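
The stated yield can be checked against the absorbance figure. The sketch below assumes a molar absorptivity of about 15.4 mM⁻¹ cm⁻¹ for adenine nucleotides near 260 nm, a value that is not given in the text, and takes one A₂₆₀ unit as the amount of material giving an absorbance of 1.0 in 1 ml over a 1 cm path. The function name is invented for the example.

```python
EPSILON_260_MM = 15.4  # assumed absorptivity of adenine nucleotides, mM^-1 cm^-1


def micromoles_from_a260_units(units, epsilon_mM=EPSILON_260_MM):
    """Convert A260 units to micromoles.

    With A = epsilon * c * l, a 1 cm path and a 1 ml volume, the
    concentration in mM (equal to micromol/ml) is A / epsilon.
    """
    return units / epsilon_mM


starting_material_umol = 50.0  # 0.05 mmol of the Example 1 product
product_umol = micromoles_from_a260_units(150)
yield_fraction = product_umol / starting_material_umol

print(f"product: {product_umol:.1f} micromol")
print(f"yield:   {yield_fraction:.1%}")
```

Under this assumed absorptivity, 150 A₂₆₀ units correspond to roughly 10 micromoles, in line with the reported yield of about 20%.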
EXAMPLE 3

Preparation of
[Br*]3'-Bromo-2',3'-Dideoxythymidine

A solution of 1-(5-O-triphenylmethyl-2'-deoxy-β-D-
threopentofuranosyl)thymine (50 mg, 0.1 mmol) in pyridine (1 ml) was treated
with methanesulfonyl chloride (0.014 ml, 0.12 mmol) and the reaction
agitated at 10°C for 6 hr. The mixture was then evaporated, diluted with
CHCl₃ (5 ml) and washed 2x with water. The organic layer was evaporated,
and the crude mesylate dissolved in dry diglyme (1 ml) and treated with
[LiBr*] (17 mg, 0.2 mmol). The solution was heated for 4 hr. at 100°C,
and then diluted with 80% acetic acid (1 ml) and heated 15 min. longer.
The reaction was cooled and diluted with water (2 ml), and extracted 3x
with chloroform (2 ml). The organic extracts were evaporated and the
residue chromatographed on silica gel eluted with 95:5 chloroform:methanol
to give 18 mg of the title product as an off-white solid.

This material could be converted to the triphosphate via the
monophosphate as in Example 2. The monophosphate was prepared by
reaction with cyanoethyl phosphate and dicyclohexylcarbodiimide (DCC),
followed by LiOH deblocking.

EXAMPLE 4

Preparation of
[Br*]2',3'-Dideoxy-5-Bromocytidine

To a solution of 2',3'-dideoxycytidine (60 mg, 0.3 mmol) in dry DMF
(1 ml) was added 0.1 ml pyridine and then [Br*]bromine (42 mg,
0.3 mmol), and the mixture agitated for 12 hr. The solvent was evaporated,
and the residue chromatographed on silica gel (ethyl acetate:methanol:tri-
ethylamine 90:10:1) to give 46 mg of the title product as a white solid.

This material could be converted to the triphosphate via the
monophosphate as in Example 2. The monophosphate was prepared by
reaction with cyanoethyl phosphate and dicyclohexylcarbodiimide followed by
LiOH deblocking.
What is claimed is:

1. A method for determining the base sequence of natural or
artificially made DNA or RNA or fragments thereof, a step in said method
comprising:

determining the identity of the bases in said sequence by
identifying the unique stable foreign isotope associated with each variety of
base.

2. In a method to determine a DNA sequence by the
dideoxynucleotide chain termination method, the improvement comprising
identifying the terminal nucleotide of the chain terminated DNA sequences
by measuring by mass spectrometry an isotope specifically incorporated in
the chain terminated DNA sequence, whereby the mass of the incorporated
isotope or its oxidation or combustion product corresponds with the identity
of specific terminal nucleotide in each said terminated DNA sequence.

3. A method according to Claim 2 wherein the isotope is
incorporated in a dideoxynucleotide.

4. A method according to Claim 2 wherein the isotope is
incorporated into a deoxynucleotide.

5. A method according to Claim 2 wherein the isotope is
incorporated into a DNA primer.

6. An improved method for sequencing DNA wherein deoxynucleotide
triphosphates, consisting of dTTP, dATP, dCTP, and dGTP, and 2',3'-
dideoxynucleotide triphosphates, consisting of ddATP, ddCTP, ddGTP, and
ddTTP, are reacted with a DNA sequence in the presence of a DNA primer
which binds to the DNA sequence and a DNA polymerase; whereby
complementary DNA sequences are formed by the addition of complementary
nucleotides to the primer and are terminated by dideoxynucleotide
triphosphate incorporation providing a terminated complementary DNA
sequence corresponding with each of the bases in the DNA sequence to be
determined and separating the resulting terminated DNA sequences according
to size; the improvement comprising:

labeling ddATP, ddCTP, ddGTP, and ddTTP respectively with
stable isotopes of different mass, and

detecting the isotope of different mass in each of the terminated
DNA sequences by mass spectroscopy so as to identify the terminal
nucleotide in each said terminated complementary DNA sequence, thereby
determining the sequence of the DNA;

whereby when the identity of the terminal nucleotide is plotted
versus the size of the fragment, the DNA sequence of the complementary
DNA is determined.

7. A method according to Claim 6 wherein ³²S, ³³S, ³⁴S and ³⁶S
isotopes are used.

8. A DNA sequence analyzer comprising

(a) a means for separating chain terminated DNA sequences according
to size;

(b) a mass spectrometer for analyzing isotopes of different mass
bound to the chain terminated DNA sequence in a relationship that
associates the mass of an isotope with a terminal nucleotide of the DNA
sequence; and

(c) a means for transporting the chain terminated DNA sequences
from the means for separating the chain terminated DNA sequences to a
combustion chamber which converts the elements of chain terminated DNA
sequence including the isotopes to oxides for analysis in the mass
spectrometer.

9. A DNA sequence analyzer according to Claim 8 wherein the mass
spectrometer contains a means for converting ³²S, ³³S, ³⁴S and ³⁶S to
³²SO₂, ³³SO₂, ³⁴SO₂ and ³⁶SO₂ and sequentially detecting the SO₂
resulting from the chain terminated DNA sequences.

10. A reagent for DNA sequencing comprising a mixture of
dideoxynucleotide triphosphates (ddCTP, ddGTP, ddTTP, and ddATP) wherein
each dideoxynucleotide triphosphate is labeled with a different isotope
which is measurable by mass spectrometry.

11. A mixture according to Claim 10 wherein the isotope is a sulfur
isotope selected from ³²S, ³³S, ³⁴S or ³⁶S.

12. A mixture of dideoxynucleotide chain terminated DNA sequences
wherein each chain terminated DNA sequence contains an isotope measurable
by mass spectrometry which is specifically related to a specific chain
terminating dideoxynucleotide.

13. A mixture according to Claim 12 wherein the isotope is ³²S, ³³S,
³⁴S or ³⁶S.

14. A mixture according to Claim 12 wherein at least one of the
isotopes is a halogen.
15. A mixture according to Claim 12 which is separated according to
size.

16. A mixture according to Claim 12 wherein the isotope is in the
DNA primer portion of the terminated DNA sequence.

17. A mixture according to Claim 12 wherein the isotope is in the
dideoxynucleotide portion of the chain terminated DNA sequence.

18. A mixture according to Claim 12 wherein the isotope in the chain
terminated DNA sequence originates with the deoxynucleotide triphosphate.

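19. A compound of the formula:

[Structural formula not recoverable; legible labels: O, O, S* on the
phosphate chain and Y, H on the sugar.]

wherein S* is selected from ³²S, ³³S, ³⁴S or ³⁶S; wherein Z is adenine,
cytosine, guanine or thymidine; and wherein Y is hydrogen or hydroxyl.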

20. A mixture of the compounds of Claim 19 wherein one sulfur 
isotope is associated with one of A, T, G, or C and Y is hydrogen. 
21. A reagent for DNA sequencing according to the formula:

[Structural formula not recoverable; legible label: Rib.]

where X is halogen or lower alkylthio having 1-6 carbon atoms; wherein Rib
is

[Structural formula not recoverable.]

wherein Y is hydrogen or hydroxyl; and wherein the sulfur of said alkylthio
or said halogen is isotopically enriched and detectable by mass spectroscopy.
22. A reagent for DNA sequencing according to the formula:

[Structural formula not recoverable; legible labels: H, Rib.]

where X is halogen or lower alkylthio having 1-6 carbon atoms; wherein Rib
is

[Structural formula not recoverable; legible fragment: a triphosphate
chain, HO-P(O)(O-)-O-P(O)(O-)-O-P(O)-O-.]

wherein Y is hydrogen or hydroxyl; and wherein the sulfur of said alkylthio
or said halogen is isotopically enriched and detectable by mass spectroscopy.

23. A reagent for DNA sequencing according to the formula:
where X is halogen or lower alkylthio having 1-6 carbon atoms; wherein Rib
is

[Structural formula not recoverable; legible fragment: a triphosphate
chain, HO-P(O)(O-)-O-P(O)(O-)-O-P(O)(O-)-O-.]

wherein Y is hydrogen or hydroxyl; and wherein the sulfur of said alkylthio
or said halogen is isotopically enriched and detectable by mass spectroscopy.

24. A reagent for DNA sequencing according to the formula:



[Structural formula not recoverable; legible label: Rib.]

wherein X is halogen or lower alkylthio having 1-6 carbon atoms; wherein
Rib is

[Structural formula not recoverable; legible fragments: a phosphate chain
and substituents Y and H.]

wherein Y is hydrogen or hydroxyl; and wherein the sulfur from said
alkylthio or said halogen is isotopically enriched and detectable by mass
spectroscopy.

25. A reagent according to Claims 21, 22, 23 or 24 wherein sulfur is
selected from ³²S, ³³S, ³⁴S and ³⁶S and wherein halogen is selected from
³⁵Cl, ³⁷Cl, ⁷⁹Br, and ⁸¹Br.

26. A reagent for DNA sequencing according to the formula:

[Structural formula not recoverable; legible fragments: a triphosphate
chain and substituents X and H.]

wherein Z is adenine, guanine, cytosine, or thymine;
wherein X is ³⁵Cl, ³⁷Cl, ⁷⁹Br or ⁸¹Br.
[Figure residue, apparently Figure 1: DNA ...TGCACATGTGCA and complementary
DNA ACGTGTACACGT, with the series of labeled chain terminated fragments
ending in A*, C*, G* or T*.]

[Figure residue, apparently Figure 2B: DNA SEQUENCE shown above the ion
current v. time traces; legible channel labels: 66, 68.]
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BACKGROUND OF THE INVENTION 
This invention is generally related to DNA . and ENA 
sequencing and, more particularly, to DNA and ENA 
sequencing by detecting individual, nucleotides. This 
5 invention is the result of a contract with the Department 
of Energy (Contract No. H-7405-ENG-36) . 

A world-wide effort is now in progress to analyze the
base sequence in the human genome. The magnitude of this
task is apparent: there are 3 x 10⁹ bases in the human
genome, and available base sequencing rates are about
200-500 bases per 10-24 hour period. Considerable
interest also exists in nucleic acid sequencing from
non-human sources. Existing procedures are labor
intensive and cost approximately $1 per base.
By way of example, Sanger et al., "DNA Sequencing with
Chain-Terminating Inhibitors," Proceedings of the National
Academy of Sciences USA 74, 5463-7 (1977), provide for
sequencing 15-200 nucleotides from a priming site.
Radioactive phosphorus is used in the primer extension to
provide a marker. Enzymatic resynthesis coupled with
chain terminating precursors is used to produce DNA
fragments which terminate randomly at one of the four DNA
bases: adenine (A), cytosine (C), guanine (G), or thymine
(T). The four sets of reaction products are separated
electrophoretically in adjacent lanes of a polyacrylamide
gel. The migration of the DNA fragments is visualized by
the action of the radioactivity on a photographic film.
Careful interpretation of the resulting band patterns is
required for sequence analysis. This process typically
takes 1-3 days. Further, there are problems with band
pile-ups in the gel, requiring further confirmatory
sequencing.

In a related technique, A. M. Maxam and W. Gilbert, "A
New Method for Sequencing DNA," Proceedings of the
National Academy of Sciences USA 74, 560-564 (1977), teach
a chemical method to break the DNA into four sets of
random length fragments, each with a defined termination.
Analysis of the fragments proceeds by electrophoresis as
described above. The results obtained using this method
are essentially the same as those of the "Sanger Method."

In another example, Smith et al., "Fluorescence
Detection in Automated DNA Sequence Analysis," Nature 321,
674-679 (June 1986), teach a method for partial automation
of DNA sequence analysis. Four fluorescent dyes are
provided to individually label DNA primers. The Sanger
method is used to produce four sets of DNA fragments which
terminate at one of the four DNA bases with each set
characterized by one of the four dyes. The four sets of
reaction products, each containing many identical DNA
fragments, are mixed together and placed on a
polyacrylamide gel column. Laser excitation is then used
to identify and characterize the migration bands of the
labeled DNA fragments on the column, where the observed
spectral properties of the fluorescence are used to
identify the terminal base on each fragment. Sequencing
fragments of up to 400 bases has been reported. Data
reliability can be a problem since it is difficult to
uniquely discern the spectral identity of the fluorescent
peaks.

These and other problems in the prior art are 
addressed by the present invention and an improved process
is provided for rapid sequencing of DNA bases. As herein 
described, the present invention provides for the 
sequential detection of individual nucleotides cleaved 
from a single DNA or RNA fragment. 

Accordingly, it is an object of the present invention 
to provide an automated base sequence analysis for DNA and 
RNA. 

Another object of the present invention is to process
long strands of DNA or RNA, i.e., having thousands of
bases.

One other object is to rapidly sequence and identify
individual bases.

Additional objects, advantages and novel features of
the invention will be set forth in part in the description
which follows, and in part will become apparent to those
skilled in the art upon examination of the following or
may be learned by practice of the invention. The objects
and advantages of the invention may be realized and
attained by means of the instrumentalities and
combinations particularly pointed out in the appended
claims.

SUMMARY OF THE INVENTION 
To achieve the foregoing and other objects, and in
accordance with the purposes of the present invention, as
embodied and broadly described herein, a method for DNA
and RNA base sequencing is provided. A single fragment
from a strand of DNA or RNA is suspended in a moving
sample stream. Using an exonuclease, the end base on the
DNA or RNA fragment is repetitively cleaved from the
fragment to form a train of the bases in the sample
stream. The bases are thereafter detected in sequential
passage through a detector to reconstruct the base
sequence of the DNA or RNA fragment.

In another characterization of the present invention,
strands of DNA or RNA are formed from the constituent
bases, which have identifiable characteristics. The bases
are sequentially cleaved from the end of a single fragment
of the strands to form a train of the identifiable bases.
The single, cleaved bases in the train are then
sequentially identified to reconstruct the base sequence
of the DNA or RNA strand.

In one particular characterization of the invention,
each of the nucleotides effective for DNA and RNA
resynthesis is modified to possess an identifiable
characteristic. A strand of DNA is synthesized from the
modified nucleotides, where the synthesized strand is
complementary to a DNA or RNA strand having a base
sequence to be determined. A single fragment of the
complementary DNA or RNA is selected and suspended in a
flowing sample stream. Individual identifiable
nucleotides are sequentially cleaved from the free end of
the suspended DNA strand. The single bases are then
sequentially identified. The base sequence of the parent
DNA or RNA strand can then be determined from the
complementary DNA strand base sequence.
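
The final step described above, recovering the parent sequence from the sequenced complementary strand, amounts to a base-by-base pairing lookup. The following is a minimal sketch of that bookkeeping only. The function name `parent_from_complement`, its parameters, and the option to reverse the read are illustrative assumptions and are not part of the described method.

```python
# Watson-Crick partners for bases read from the complementary DNA strand.
# "U" is included because dUTP may stand in for dTTP during synthesis.
_PARTNER = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def parent_from_complement(complement_read, parent_is_rna=False, reverse=False):
    """Return the parent-strand bases paired with each base of the read.

    complement_read: bases of the complementary strand in the order detected.
    parent_is_rna:   if True, adenine in the read pairs with uracil, not thymine.
    reverse:         if True, reverse the result (assumption: the caller wants
                     the parent written in the opposite direction to the read).
    """
    partners = dict(_PARTNER)
    if parent_is_rna:
        partners["A"] = "U"
    parent = "".join(partners[base] for base in complement_read.upper())
    return parent[::-1] if reverse else parent


print(parent_from_complement("ACGTGTACACGT"))
print(parent_from_complement("ACGT", parent_is_rna=True))
```

Whether the result should be reversed depends on which end of the complementary strand the exonuclease starts from, so that choice is left to the caller here.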

BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in
and form a part of the specification, illustrate an
embodiment of the present invention and, together with the
description, serve to explain the principles of the
invention. In the drawings:

FIGURE 1 is a graphic illustration of a DNA sequencing
process according to the present invention.

FIGURE 2 is a graphical representation of an output
signal according to the present invention.

DETAILED DESCRIPTION OF THE INVENTION

According to the present invention, a method is
provided for sequencing the bases in large DNA or RNA
fragments by isolating single DNA or RNA fragments in a
moving stream and then individually cleaving single bases
into the flow stream, forming a sequence of the bases
through a detection device. In one embodiment, the single
bases in the flowing sample stream are interrogated by
laser-induced fluorescence to determine the presence and
identity of each base.

It will be understood that DNA and RNA strands are
each formed from nucleotides comprising one of four
organic bases: adenine, cytosine, guanine, and thymine
(DNA) or uracil (RNA). The DNA and RNA nucleotides are
similar, but not identical; however, the nucleotides and
strands of nucleotides can be functionally manipulated in
a substantially identical manner. Also, the complement of
an RNA fragment is conventionally formed as a DNA strand
with thymine in place of uracil. The following
description is referenced to DNA sequencing, but any
reference to DNA includes reference to both DNA and RNA
and without any limitation to DNA.

In a particular embodiment of the present invention,
the initial step is an enzymatic synthesis of a strand of
DNA, complementary to a fragment to be sequenced, with
each base containing a fluorescent tag characteristic of
the base. Sequencing the complementary strand is
equivalent to sequencing the original fragment. The
synthesized strand is then suspended in a flowing sample
stream containing an exonuclease to cleave bases
sequentially from the free end of the suspended DNA or
RNA. The cleaved, fluorescently labeled bases then pass
through a focused laser beam and are individually detected
and identified by laser-induced fluorescence.

The maximum rate at which bases may be sequenced is
determined by the kinetics of the exonuclease reaction
with DNA or RNA and the rate of detection. A projected
rate of 1000 bases/sec would result in sequencing
8 x 10⁷ bases/day. This is in contrast to standard
techniques which take 10-24 hours to sequence 200-500
bases.
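
The projected throughput quoted above is a unit conversion, and the short sketch below reproduces it for checking. At 1000 bases/sec the exact figure is 8.64 x 10⁷ bases/day, which the text rounds down to 8 x 10⁷. The helper names are illustrative, and the comparison assumes the conventional figures (200-500 bases per 10-24 hours) can be scaled linearly to a 24-hour day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def bases_per_day(bases_per_second):
    """Convert a continuous sequencing rate from bases/sec to bases/day."""
    return bases_per_second * SECONDS_PER_DAY


def conventional_bases_per_day(bases, hours):
    """Scale a batch result (bases obtained in `hours`) to a 24-hour day."""
    return bases * 24 / hours


projected = bases_per_day(1000)
slowest = conventional_bases_per_day(200, 24)  # 200 bases in 24 hours
fastest = conventional_bases_per_day(500, 10)  # 500 bases in 10 hours

print(f"projected rate: {projected:.2e} bases/day")
print(f"conventional range: {slowest:.0f} to {fastest:.0f} bases/day")
print(f"ratio: {projected / fastest:,.0f}x to {projected / slowest:,.0f}x")
```

Under these assumptions the projected rate exceeds the conventional range by roughly five orders of magnitude.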

Referring now to Figure 1, one effective sequencing
method comprises the following steps: (1) prepare a
selected strand of DNA 10, in which individual bases are
provided with an identifiable characteristic, e.g.,
labeled with color-coded fluorescent tags to enable each
of the four bases to be identified, (2) select and suspend
40 a single fragment of DNA with identifiable bases in a
flowing sample stream, (3) sequentially cleave 20 the
identifiable bases from the free end of the suspended DNA
fragment, and (4) identify the individual bases in
sequence, e.g., detect 34 the single, fluorescently
labeled bases as they flow through a focused laser
system. Exemplary embodiments of the individual process
steps are hereinafter discussed.
Selection of DNA Fragment to be Sequenced

In accordance with the present process, a single DNA
fragment 10a is selected and prepared for labeling and
analysis. In an exemplary selection process from a
heterogeneous mixture of DNA fragments, avidin is bound to
microspheres and a biotinylated probe, complementary to
some sequence within the desired DNA fragment 10a, is
bound to the avidin on the microspheres. The
avidin-biotinylated probe complex is then mixed with the
heterogeneous mixture of DNA fragments to hybridize with
the desired fragments 10a. The beads are separated from
the unbound fragments and washed to provide the desired
homogeneous DNA fragments 10a.

The selected fragments are further processed by
removing the first microsphere and ligating a tail of
known sequence 9 to the primer 12 attached to the 3' end
of the fragment 10a. Microspheres 40 are prepared with
phycoerythrin-avidin and sorted to contain a single
molecule of phycoerythrin-avidin. A single complementary
probe 9a to the known sequence 9 is biotinylated and bound
to the sorted microspheres 40. The bead-probe complex is
then hybridized to the selected fragment 10a. Thus, a
single fragment of DNA 10a will be bound to each
microsphere.

In another embodiment, a homogeneous source of DNA 
fragments is provided, e.g., from a gene library. A
selection step is not then required and the homogeneous 
DNA fragments can be hybridized with the microspheres 40 
containing a single molecule of phycoerythrin-avidin, with 
the appropriate complementary probe attached as above. 

In either case, a single microsphere 40 can now be 
manipulated using, for example, a microinjection pipette 
to transfer a single fragment strand for labeling and 
analysis as discussed below. 
Fluorescence Labeling of Bases 

The bases forming the single fragment to be analyzed
are provided with identifiable characteristics. The
identifiable characteristic may attach directly to each
nucleotide of DNA strand 10a. Alternatively, bases may
first be modified to obtain individual identifiable
characteristics and resynthesized to selected strand 10a
to form a complementary DNA strand. In either event, DNA
fragment 10 is provided for analysis with identifiable
bases.

In one embodiment, a fluorescent characteristic is 
provided. The bases found in DNA do have intrinsic 
fluorescence quantum yields <10⁻³ at room
temperature. In order to detect these bases by a 
fluorescence technique, however, it is desirable to modify
them to form species with large fluorescence quantum
yields and distinguishable spectral properties, i.e., to
label the bases. 

It is known how to synthesize a complementary strand
of DNA with labeled bases using an enzymatic procedure.
See, e.g., P. R. Langer et al., "Enzymatic Synthesis of
Biotin-Labeled Polynucleotides: Novel Nucleic Acid
Affinity Probes," Proc. Natl. Acad. Sci. USA 78, 6633
(1981); M. L. Shimkus et al., "Synthesis and
Characterization of Biotin-Labeled Nucleotide Analogs,"
DNA 5, 247 (1986), all incorporated herein by reference.
Referring to Figure 1, a primer 12 is attached to the 3'
end of a DNA fragment 10a and an enzyme, e.g., DNA
polymerase-Klenow fragment, is used to synthesize the
complement to DNA fragment 10a starting from the end of
primer 12. Modified deoxynucleotides 14, 16, 18, 22 are
used in the synthesis (typically modified dATP 14a, dTTP
(or dUTP) 16a, dCTP 18a, and dGTP 22a).

Each of the modified nucleotides is formed with a long
carbon chain linker arm 14b, 16b, 18b, and 22b,
respectively, terminating in a characteristic fluorescent
dye 14c, 16c, 18c, and 22c. The modified nucleotides 14,
16, 18, and 22 are then incorporated into the synthesized
fragment by DNA polymerase. The long linker arms 14b,
16b, 18b, 22b isolate the fluorescent dye tags 14c, 16c,
18c, 22c from the bases 14a, 16a, 18a, 22a to permit
uninhibited enzyme activity.

DNA fragments several kb long have been synthesized
with each base containing a carbon chain linker arm
terminating in biotin as hereinafter described. To
exemplify the DNA synthesis, tagging, and cleaving
processes, a known strand of DNA nucleotides was formed,
nucleotides were tagged with a linker arm terminating in
biotin, and a complementary strand of DNA was synthesized
from the tagged nucleotides. Biotin was used as a model
tag rather than fluorescent dyes to demonstrate the
synthesis and cleavage reactions.

1. Preparation of known strand [d(A,G)]:
A polydeoxynucleotide, d(A,G)₂₁₃₈, was prepared by
the method outlined in R. L. Ratliff et al.,
"Heteropolynucleotide Synthesis with Terminal
Deoxyribonucleotidyltransferase," Biochemistry 6, 851
(1967) and "Heteropolynucleotides Synthesized with
Terminal Deoxyribonucleotidyltransferase. II. Nearest
Neighbor Frequencies and Extent of Digestion by
Micrococcal Deoxyribonuclease," Biochemistry 7, 412
(1968). The subscript, 2138, refers to the average number
of bases in the fragment and the comma between the A and
the G indicates that the bases are incorporated in a
random order.

Ten micromoles of the 5'-triphosphate of
2'-deoxyadenosine (dATP) were mixed with one micromole of
the 5'-triphosphate of 2'-deoxyguanosine (dGTP) and 5.5
nanomoles of the linear heptamer of 5'-thymidylic acid
[d(pT)₇], which acts as a primer. Ten thousand units of
terminal transferase were added to the solution, which was
buffered at pH 7, and the reaction mixture was maintained
at 37°C for 24 hours. (One unit is defined as the
amount of enzyme which will polymerize 1 nanomole of
nucleotide in one hour.) The resulting d(A,G)₂₁₃₈ was
then separated from the reaction mixture and purified.

2. Preparation of biotinylated complementary strand
[d(C,U)₂₁₃₈]:

The complementary strand of DNA to d(A,G)₂₁₃₈,
prepared as described above, was synthesized from
nucleotides (dCTP and dUTP) tagged with biotin. A
mixture of 10 nanomoles of the biotinylated
5'-triphosphate of 2'-deoxycytidine (dCTP) and 20
nanomoles of the biotinylated 5'-triphosphate of
2'-deoxyuridine (dUTP) was added to 10 nanomoles of
d(A,G)₂₁₃₈ and 22 picomoles of d(pT)₇. Ten units of
DNA polymerase (E. coli), Klenow fragment, were then added
to the mixture, which was buffered at pH 8 and maintained
at a temperature of 37°C for 2 hours. Analysis of the
resulting products by electrophoresis demonstrated that
the reaction went to completion and the completely
biotinylated complementary DNA fragment, d(C,U)₂₁₃₈, was
formed.

3. Exonuclease cleavage of biotinylated d(C,U)₂₁₃₈:
The completely biotinylated d(C,U)₂₁₃₈, synthesized
as described above, was sequentially cleaved by adding 10
units of exonuclease III to 5 nanomoles of
d(A,G)₂₁₃₈ · biotinylated d(C,U)₂₁₃₈. The reaction
mixture was maintained at pH 8 and 37°C for two
hours. At the end of two hours, analysis of the reaction
mixture showed that 30% of the DNA was cleaved and the
cleavage reaction appeared to be still proceeding. A
control reaction using the normal, unbiotinylated strand
yielded [value illegible in source] cleavage in two hours.
Hence, biotinylation does appear
to slow the cleavage reaction using exonuclease III, but
the tagged nucleotides were sequentially cleaved from the
DNA fragments.

In accordance with the present invention, the selected
fluorescent dyes are substituted for biotin to
specifically tag each nucleotide type with a dye
characteristic of that nucleotide. The resulting
complementary DNA chain will then provide each base with a
characteristic, strongly fluorescing dye. By way of
example, Smith et al., supra, teach a set of four
individually distinguishable tags.

The sensitivity for fluorescence detection can be
increased, if necessary, by attaching several dye
molecules along the linker arm. Alternatively, large
phycoerythrin-like molecules or even small microspheres
containing many dye molecules may be attached to the
linker arm. In yet another alternative, fluorescent
labels might be attached to the primary, single-stranded
fragment, thereby eliminating the necessity of forming
labeled bases and synthesizing the complementary strand.

It should be noted that DNA fragment 10 may be either
a single or double strand of DNA. A single strand of DNA
arises where the selected DNA strand is directly tagged
for base identification or where the resynthesized
complementary tagged DNA strand is separated from the
selected strand. A double strand arises where the
resynthesized DNA strand remains combined with the
selected strand. As used herein, the term "fragment"
refers to any and all of such conditions.

Enzymatic Cleavage of the Tagged Nucleotides

After DNA fragment 10 is formed with identifiable
bases and hybridized to microsphere 40, a single fragment
10 can be manipulated with microsphere 40 and suspended in
flow stream 24. Exonuclease 20 is used to cleave bases
14a, 16a, 18a, 22a sequentially from single DNA fragment
10 suspended in flow stream 24. While the presence of the
linker arm and the fluorescent dye may inhibit the
enzymatic activity of some exonucleases, suitable
exonucleases will cleave with only a slight reduction in
rate. Individual bases have been sequentially
enzymatically cleaved from DNA fragments formed completely
from biotinylated nucleotides as demonstrated above.
See also, e.g., M. L. Shimkus et al., supra, incorporated
herein by reference. The rate of cleavage can be adjusted
by varying the exonuclease concentration, temperature, or
by the use of poisoning agents. The time to remove one
base can be made to be on the order of one millisecond.
See, e.g., W. E. Razzell et al., "Studies on
Polynucleotides," J. Biol. Chem. 234, No. 8, 2105-2112 (1959).
Single Molecule Detection 

The individual modified nucleotides 14, 16, 18, and 22
are carried by flow stream 24 into flow cell 26 for
detection and analysis by single molecule detection system
34. One embodiment of a laser-induced fluorescence
detection system is described in D. C. Nguyen et al.,
"Ultrasensitive Fluorescence Detection in
Hydrodynamically Focused Flows," J. Opt. Soc. Am. B 4,
No. 2, 138-143 (1987), incorporated herein by reference.
The photomultiplier-based detection system described
therein has detected single molecules of phycoerythrin in
focused, flowing sample streams by laser-induced
fluorescence. See D. C. Nguyen et al., "Detection of
Single Molecules of Phycoerythrin in Hydrodynamically
Focused Flows by Laser-Induced Fluorescence," Anal. Chem.
59, 2158-2161 (September 1987), incorporated herein by
reference.



WO 89/03432    PCT/US88/03194

Phycoerythrin is a large protein containing the
equivalent of 25 rhodamine-6G dye molecules. The
detection of single molecules/chromophores of rhodamine-6G
and equivalent dye molecules is suggested by system
improvements. Thus, a combination of improved light
collection efficiency, improved detector quantum
efficiency, or pulsed excitation and gated detection to
reduce background noise can be used with the Nguyen et al.
system. Detection of phycoerythrin was accomplished in
the 180 µs it took the molecule to flow through the
focused laser beam.

In a preferred embodiment of the present process, the
hydrodynamically focused flow system of Nguyen et al. is
provided with an improved fluorescence detection system
described in a copending patent application by Shera,
"Single Molecule Tracking," Docket No. 65,737,
incorporated herein by reference. As therein described,
flow stream 24 provides to flow cell 26 modified
nucleotides 14, 16, 18, and 22 in the sequence they are
cleaved from DNA strand 10. Laser system 32 excites
fluorescent dyes 14c, 16c, 18c, and 22c at selected
wavelengths for identification in laminar sample flow 28
within flow cell 26.

Fluorescent events contained in optical signal 36 are
focused by lens 38 on position sensitive detector system
42. Detector system 42 may comprise a microchannel plate
(MCP) sensor to output spatial coordinates of observed
photon events. An internal clock provides a temporal
coordinate, wherein data processor 44 determines the
presence of a molecule within flow cell 26. Molecular
spectral response to laser 32 excitation enables the
specific modified nucleotide to be identified. As noted
by Shera, supra, data handling in the single molecule
detection system 34 effectively provides a moving sample
volume within focused flow stream 28 which contains only a
single tagged nucleotide. System 34 can thus track
multiple molecules existing within focused flow stream 28
to enable a high rate of sequencing to be maintained.
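
One way to picture the "moving sample volume" is to note that photons emitted by a molecule carried at a constant flow velocity v satisfy x - v*t = constant, so photon events can be grouped by that quantity. The sketch below illustrates the idea only. It is not the data handling of the Shera application, whose details are not given here. The function name, the uniform known velocity, the one-dimensional position, and the fixed-width binning are all assumptions.

```python
from collections import defaultdict


def group_photons_by_molecule(events, flow_velocity, bin_width, min_photons=3):
    """Group photon events (t, x) into candidate molecules.

    A molecule moving at `flow_velocity` keeps x - flow_velocity * t constant,
    so photons sharing (approximately) the same value are assigned to the same
    molecule. Bins holding fewer than `min_photons` events are treated as
    background and dropped. Groups are returned in order of first detection.
    """
    bins = defaultdict(list)
    for t, x in events:
        offset = x - flow_velocity * t
        bins[round(offset / bin_width)].append((t, x))
    molecules = [sorted(p) for p in bins.values() if len(p) >= min_photons]
    molecules.sort(key=lambda photons: photons[0][0])
    return molecules


# Hypothetical photon events as (time, position) pairs in arbitrary units.
events = [
    (1.0, 1.0), (2.0, 2.0), (3.0, 3.0),  # first molecule
    (6.0, 1.0), (7.0, 2.0), (8.0, 3.0),  # second molecule, entering later
    (4.0, 9.3),                          # isolated background photon
]
molecules = group_photons_by_molecule(events, flow_velocity=1.0, bin_width=1.0)
print(len(molecules), "molecules tracked")
```

Fixed-width binning can split a cluster that straddles a bin edge, so a real implementation would need a more careful clustering step.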

Referring now to Figure 2, there is shown a
representative output signal from the single molecule
detection system. The individual nucleotide molecules 14,
16, 18, and 22 are individually cleaved from DNA strand 10
into flow stream 24. The flow velocity and laminar flow
conditions maintain the molecules in a train for
sequential passage through flow cell 26, and the emitted
photons from laser-excited molecular fluorescence are
assigned to individual molecules passing within the cell.
The characteristic dye for each type of nucleotide is
selected to have an identifiable excitation or
fluorescence spectrum. This characteristic spectrum can
be used to establish the base sequence for the DNA strand
being investigated.
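
Establishing the base sequence from the characteristic spectra can be sketched as a lookup applied to the detections in arrival order. The example below is a hedged illustration: the dye labels (`dye_A` and so on), the `(arrival_time, dye_label)` event format, and the use of "N" for an unrecognized spectrum are assumptions made for the sketch and do not come from the description.

```python
# Hypothetical mapping from an identified dye spectrum to the base it tags.
DYE_TO_BASE = {"dye_A": "A", "dye_C": "C", "dye_G": "G", "dye_T": "T"}


def call_bases(detections):
    """Return the base sequence for (arrival_time, dye_label) detections.

    Detections are ordered by arrival time, since the train of cleaved
    nucleotides reaches the detector in the order of cleavage. A label that
    is not in DYE_TO_BASE is reported as "N" (assumption).
    """
    ordered = sorted(detections, key=lambda detection: detection[0])
    return "".join(DYE_TO_BASE.get(dye, "N") for _, dye in ordered)


detections = [
    (0.003, "dye_G"),
    (0.001, "dye_A"),
    (0.002, "dye_C"),
    (0.004, "dye_T"),
    (0.005, "unrecognized"),
]
print(call_bases(detections))
```

The resulting string is the sequence of the tagged strand as read; where that strand is the synthesized complement, the parent sequence follows by base pairing.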

It will be appreciated that the present process 
further provides a capability to sort the detected 
molecules and deposit them on a moving substrate for 
subsequent identification, e.g., as described in M. R.
Melamed et al., "Flow Cytometry and Sorting," Wiley, New
York (1979), incorporated herein by reference. The flow
stream maintains the bases spatially isolated in a flow 
stream for presentation to a secondary identification 
device. The position between molecules on the moving 
substrate can be adjustable and can be large enough to 
resolve the sorted molecules by other techniques. 

The foregoing description of the preferred embodiment
of the invention has been presented for purposes of
illustration and description. It is not intended to be
exhaustive or to limit the invention to the precise form
disclosed, and obviously many modifications and variations
are possible in light of the above teaching. The
embodiment was chosen and described in order to best
explain the principles of the invention and its practical
application to thereby enable others skilled in the art to
best utilize the invention in various embodiments and with
various modifications as are suited to the particular use
contemplated. It is intended that the scope of the
invention be defined by the claims appended hereto.

WHAT IS CLAIMED IS:

1. A method for DNA and RNA base sequencing,
comprising the steps of:

isolating a single fragment of DNA or RNA;

introducing said single fragment into a moving sample
stream;

sequentially cleaving the end base from the DNA or RNA
fragment with exonuclease to form a train of said bases;
and

detecting said bases in said train in sequential
passage through a detector.

2. A method according to Claim 1, wherein each said
base of said single fragment is modified to contain a tag
having an identifiable characteristic for said base.

3. A method according to Claim 2, where said bases
are modified prior to said cleavage.

4. A method according to Claim 2, further including
the step of enzymatically synthesizing a strand of DNA
complementary to a DNA or RNA strand to be characterized,
where each nucleotide forming said synthesized strand
contains a tag characteristic of that nucleotide.

5. A method according to Claim 2, wherein said tag
is separated from the nucleotide by a linker arm effective
for said cleavage.

6. A method according to Claim 1, wherein said
cleaved bases are detected optically.

7. A method according to Claim 6, wherein each said
tag is a fluorescent dye characteristic of one type of
said nucleotides.

8. A method according to Claim 7, further including
the step of exciting each said fluorescent dye and
detecting the fluorescence spectrum of said dye.

9. A method according to Claim 1, wherein said step
of isolating said single fragment of DNA or RNA includes
the step of hybridizing said fragment to a substrate
having a site effective for said hybridization.

10. A method according to Claim 9, further including
the step of selecting said DNA or RNA fragments from a
heterogeneous collection of DNA or RNA fragments wherein
said site is a biotinylated probe effective to hybridize
with DNA or RNA fragments to be selected.

11. A method according to Claim 9, wherein said 
isolating said single fragment includes the step of 
providing said substrate with a single site effective to 
hybridize with a single DNA fragment. 

12. A method for base sequencing of DNA or RNA
fragments, comprising the steps of:

forming said fragments with bases having identifiable
characteristics;

sequentially cleaving single identifiable bases from a
single one of said fragments to form a train of said
identifiable bases; and

identifying said single, cleaved bases in said train.

13. A method according to Claim 12, further including
the step of attaching a characteristic identifiable
fluorescent dye to each said base.

14. A method according to Claim 12, wherein the step
of forming said fragments includes the steps of forming by
enzymatic synthesis a complementary strand of said DNA or
RNA to be sequenced from said bases having identifiable
characteristics and thereafter base sequencing said
complementary strand.

15. A method according to Claim 14, further including
the step of attaching a characteristic identifiable
fluorescent dye to each said base.

16. A method according to Claim 13, wherein said step
of identifying said single, cleaved bases includes the
step of exciting each said fluorescent dye and detecting
the fluorescence spectrum of said dye.

17. A method according to Claim 15, wherein said step
of identifying said single, cleaved bases includes the
step of exciting each said fluorescent dye and detecting
the fluorescence spectrum of said dye.

18. A method for DNA or RNA base sequencing,
comprising the steps of:

modifying each nucleotide effective for DNA or RNA
synthesis to attach a fluorescent dye characteristic of
that nucleotide with a linker arm effective to enable DNA
or RNA synthesis and exonuclease cleavage;

synthesizing from said modified nucleotides a strand
of DNA complementary to a DNA or RNA strand having a base
sequence to be determined;

cleaving each said modified nucleotide sequentially
from a single fragment containing said complementary DNA
strand; and

fluorescing each said characteristic dye to identify
said sequence of nucleotides.

19. A method according to Claim 18, wherein the step
of fluorescing said dyes further comprises the steps of:

exciting each said modified nucleotide with a laser
effective to fluoresce said characteristic dye; and

detecting said fluorescence to sequentially identify
said nucleotides and generate said sequence of said DNA or
RNA.

20. A method according to Claim 18, further including
the step of suspending a single fragment of synthesized
DNA or RNA strand in a laminar flow stream.

21. A method according to Claim 18, wherein each
synthesized DNA or RNA fragment is manipulated by
hybridizing said fragment to a microsphere having a site
effective for hybridization.